Abstract

Let $h$ be a three times partially differentiable function on $\mathbb{R}^n$, let
$X = (X_1,\dots,X_n)$ be a collection of real-valued random variables and
let $Z = (Z_1,\dots,Z_n)$ be a multivariate Gaussian vector. In this article,
we develop Stein's method to give error bounds on the difference
$\mathbb{E}h(X) - \mathbb{E}h(Z)$ in cases where the coordinates of $X$ are not necessarily
independent, focusing on the high-dimensional case $n \to \infty$. In order to
express the dependency structure we use Stein couplings, which allow for
a broad range of applications, such as classic occupancy, local dependence and
the Curie-Weiss model. We also give applications to the Sherrington-Kirkpatrick
model and last passage percolation on thin rectangles.

Keywords: Stein's method; Gaussian interpolation; Curie-Weiss model;
last passage percolation on thin rectangles; Sherrington-Kirkpatrick
model

1 INTRODUCTION 

Let $X$ and $Z$ be random vectors in $\mathbb{R}^n$ and let $h : \mathbb{R}^n \to \mathbb{R}$ be a function of
interest. A fundamental problem in probability theory is to obtain bounds on
the quantity

$$|\mathbb{E}h(X) - \mathbb{E}h(Z)|, \tag{1.1}$$

that is, to estimate the error when we replace $X$ in $\mathbb{E}h(X)$ by $Z$. If the error
in (1.1) is small irrespective of the detailed properties of $X$ and $Z$ then we will
attribute to the function $h$ a certain degree of universality, which means that
the expected value only depends on certain basic properties of $X$ and $Z$, such
as the first few moments.

Of particular interest is the case where $Z$ is a Gaussian vector having the
same (or a similar) covariance structure as $X$, and probably the most prominent
occurrence of such universality is the central limit theorem. If $X$ is a random
vector such that the $X_i$ are independent of each other, centred and scaled such
that $\sum_i \operatorname{Var} X_i = 1$, and $Z$ is a centred Gaussian vector with uncorrelated
coordinates having the same variances as those of $X$, then it is well known that
(1.1) is small for functions of the form

$$h(x) = g\Bigl(\sum_i x_i\Bigr), \tag{1.2}$$

where $g : \mathbb{R} \to \mathbb{R}$ is not too irregular. A common heuristic says that the central
limit theorem will also hold if independence is replaced by some form of "weak"
dependence, and, furthermore, it can be expected that in many cases (1.1) will
be small for more general functions than (1.2). Thus, in terms of dropping
independence and considering more general functions than (1.2), universality
can often be observed beyond the standard setting of the central limit theorem.

Even if the vector $X$ is such that $\sum_i X_i$ does not satisfy the central limit
theorem, we can consider (1.1) for functions more general than (1.2). Let,
for example, $X_i$ be the number of balls that end up in the $i$th urn, when a
fixed number of balls $m$ is distributed independently among $n$ urns. Clearly,
$\sum_i X_i = m$ and hence the sum does not satisfy a central limit theorem. As we
shall see, it is nevertheless possible to give informative bounds on (1.1) in this
case.

Over the last decades, Stein's method has proved to be a very robust method
to obtain explicit bounds for univariate and multivariate distributional
approximations in cases where $X$ exhibits non-trivial dependencies which are not of
martingale type, but more combinatorial in flavour. Although Stein's method
for the multivariate normal distribution has been successfully implemented in
many places (see Reinert and Röllin (2009) and references therein), the
dependence on the dimension of the results obtained so far may give the impression
that the method is not suitable if the dimension grows linearly with the size of
the problem. Indeed, this high-dimensional case has remained untackled until
now. The purpose of this article is to close this gap.

It is important to note at this point that the type of bounds that we will
obtain will generally not imply that the marginal distributions of the individual
coordinates converge to a normal distribution. That is, the aim is not
to prove convergence to a multivariate normal distribution. In the already
mentioned example of classic occupancy, if the numbers of balls and urns are of
the same order, then $X_i$ will converge to a Poisson distribution with mean
equal to the limiting ratio $\lim m/n$. Bounds on (1.1) will only be informative if
they are smaller than the fluctuation of $h(X)$, that is, if the bounds are smaller
than $\mathbb{E}|h(X)|$ (assuming here without loss of generality that $\mathbb{E}h(Z) = 0$), which
is an obvious upper bound on (1.1). The bounds that we obtain for functions
$h$ that concentrate only on a few coordinates will typically have the same order
as $\mathbb{E}|h(X)|$ and hence will not (and often cannot) be informative.

The remainder of the article is organised as follows. In the next section
we will first discuss the key tools used in this article, in particular the
fundamental idea of using interpolation to estimate (1.1), the Gaussian integration
by parts formula and multivariate Stein couplings, leading to our main result,
Lemma 2.1. In Section 3 we will then give some abstract and more concrete
examples of Stein couplings, ranging from the independent case to more
sophisticated dependencies. In Section 4 we will discuss various applications.

2 THE MAIN IDEA

An old idea to compare two quantities of interest is to find an interpolating
sequence between them and to estimate the error "along the way" of the
interpolation using the derivatives of $h$ (paraphrasing Talagrand (2010) on "Gaussian
interpolation and the smart path method"). One of the earliest encounters of
this idea is Lindeberg's method of telescoping sums. Define the interpolating
sequence

$$Y(i) = (X_1,\dots,X_i,Z_{i+1},\dots,Z_n), \tag{2.1}$$

and write

$$\mathbb{E}h(X) - \mathbb{E}h(Z) = \sum_{i=1}^{n} \mathbb{E}\bigl\{h(Y(i)) - h(Y(i-1))\bigr\}; \tag{2.2}$$

one can now bound the right hand side of (2.2) using Taylor expansion; this idea
has been successfully implemented by Rotar' (197?), Chatterjee (2007), Mossel
et al. (2010) and Tao and Vu (2010) and surely by other authors. One of the
important consequences of this approach is apparent when we look at (2.1): it
forces us to treat the coordinates of $X$ in an ordered way. If the components
of $X$ are independent or, more generally, a martingale difference sequence, then
this is of course desirable, and, indeed, quite a few central limit theorems for
martingales are based upon (2.2) (see e.g. Bolthausen (1982), Grama (1992)
and Rinott and Rotar'). And even if no such structure is apparent in
the problem, one can sometimes arrange $X$ such that it will be close enough to
a martingale difference sequence.
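
The telescoping identity behind (2.2) already holds pointwise, before taking expectations. The following is a minimal sketch in Python that checks this for one fixed pair of vectors; the test function `h` and the sample values are hypothetical and chosen only for illustration.

```python
import math

def interpolate(x, z, i):
    """Y(i) = (x_1, ..., x_i, z_{i+1}, ..., z_n): first i coordinates from x."""
    return x[:i] + z[i:]

def telescoping_terms(h, x, z):
    """Increments h(Y(i)) - h(Y(i-1)) for i = 1, ..., n."""
    n = len(x)
    return [h(interpolate(x, z, i)) - h(interpolate(x, z, i - 1))
            for i in range(1, n + 1)]

# Hypothetical smooth test function; the identity holds for any h.
def h(v):
    return math.tanh(sum(v)) + 0.1 * sum(t * t for t in v)

x = [0.3, -1.2, 0.7, 0.05]
z = [-0.4, 0.9, 1.1, -0.6]
terms = telescoping_terms(h, x, z)
# sum(terms) agrees with h(x) - h(z) up to floating point rounding
```

Each increment changes exactly one coordinate, which is why the order of the coordinates matters in this approach.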

This approach, however, is not entirely satisfying. Often the martingale
structure is "artificial" and one would like to make use of a more natural
dependence structure in $X$ instead (rates of convergence being another reason
to avoid martingales). And in many cases, it may be difficult to linearise
the problem at all.

A key difference in Stein's method is to choose an interpolating sequence that,
in contrast to Lindeberg's telescoping sum, treats the components of $X$
symmetrically. Note that (2.1) essentially interpolates "along the coordinate axes"
and the order of the axes determines the linearisation of the problem. Instead,
we will interpolate between $X$ and $Z$ in a way that will linearly interpolate
between the matrices $XX^t$ and $ZZ^t$. This approach is well known from Gaussian
interpolation and was independently developed by Slepian (1962) and Stein (1972),
although the technique used by Stein looks very much different from what is
commonly referred to as Gaussian interpolation (the interpolation is "hidden"
in the solution to the so-called Stein equation).

Now, assume that $X$ and $Z$ are independent and define the interpolating
sequence $Y_t = \sqrt{t}X + \sqrt{1-t}Z$, $0 \le t \le 1$. Note that, if $\mathbb{E}X = \mathbb{E}Z = 0$, then
$\mathbb{E}Y_t = 0$ and, if $\operatorname{Cov}(X) = \operatorname{Cov}(Z)$, then $\operatorname{Cov}(Y_t) = \operatorname{Cov}(X)$ for all $t$ (which
may serve as an explanation why this particular $Y_t$ is a good choice). With $h_i$
being the partial derivative in the $i$th coordinate, we can write

$$\mathbb{E}h(X) - \mathbb{E}h(Z) = \int_0^1 \frac{\partial}{\partial t}\mathbb{E}h(Y_t)\,dt = \frac{1}{2}\int_0^1 \mathbb{E}\sum_i \Bigl(\frac{X_i}{\sqrt{t}} - \frac{Z_i}{\sqrt{1-t}}\Bigr) h_i(Y_t)\,dt \tag{2.3}$$

(differentiation in (2.3) corresponds to taking differences in (2.2) and integration
replaces summation, but this is only a technical difference). One can easily see
that, on the right hand side of (2.3), the coordinates are treated symmetrically.
The result obtained by Slepian (1962) (known as Slepian's Lemma) assumes that
$X$ and $Z$ are centred Gaussian vectors having a different covariance structure.
In this case, the Gaussian integration by parts formula

$$\mathbb{E}\{Z_i h(Z)\} = \sum_j \operatorname{Cov}(Z_i,Z_j)\,\mathbb{E}h_j(Z) \tag{2.4}$$

can be used to estimate the error on the right hand side of (2.3) in terms of the
covariances. Stein (1972), on the other hand, considered the univariate case,
but where $X$ is not Gaussian. Although (2.4) can still be used for $Z$, it needs
to be replaced by an approximate version of (2.4) for $X$.
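
As a quick sanity check of the Gaussian integration by parts formula, one can work it out by hand in the univariate case. The choice $h(z) = z^3$ below is only an illustrative assumption and is not taken from the text; it uses the standard Gaussian moments $\mathbb{E}Z^2 = \sigma^2$ and $\mathbb{E}Z^4 = 3\sigma^4$.

```latex
% Univariate check with Z ~ N(0, sigma^2) and the illustrative choice h(z) = z^3.
\begin{align*}
  \mathbb{E}\{Z\,h(Z)\} &= \mathbb{E}Z^4 = 3\sigma^4, \\
  \operatorname{Cov}(Z,Z)\,\mathbb{E}h'(Z) &= \sigma^2 \cdot 3\,\mathbb{E}Z^2 = 3\sigma^4 .
\end{align*}
```

Both sides agree, as the formula predicts. For a non-Gaussian $X$ with the same variance the two sides generally differ, which is the gap an approximate version has to control.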

In order to formalise this approximate version of the Gaussian integration
by parts formula, we will make use of a multivariate generalisation of Stein
couplings, which were introduced by Chen and Röllin (2010) in the univariate
case, and then give more concrete constructions later on. Throughout this
article summations will always range from 1 to $n$ unless otherwise stated.

Definition 2.1. Let $(X,X',G)$ be a triple of $n$-dimensional random vectors.
We say that the triple is a Stein coupling if, for any smooth enough function
$f : \mathbb{R}^n \to \mathbb{R}$, we have

$$\mathbb{E}\sum_i X_i f_i(X) = \mathbb{E}\sum_i G_i\bigl(f_i(X') - f_i(X)\bigr) \tag{2.5}$$

whenever the involved expectations exist.

Remark 2.2. If $(X,X',G)$ is a Stein coupling, it follows from the definition
that

$$\mathbb{E}X = 0, \qquad \mathbb{E}(G_i D_j) = \operatorname{Cov}(X_i,X_j), \tag{2.6}$$

for all $i$ and $j$, where we let $D = X' - X$ throughout this article.

Equation (2.5) is the key condition to obtain an approximate Gaussian
integration by parts formula: if $X$ and $X'$ are close to each other, then the difference
on the right hand side of (2.5) can be approximated by the corresponding
derivatives, leading to a formula similar to (2.4). Hence, it is crucial that $X'$ is only
a small perturbation of $X$.

The following result, although not difficult to prove, is crucial for our ap- 
proach. On one hand, it measures how closely X satisfies the Gaussian integra- 
tion by parts formula and, on the other hand, also compares the covariances of 
X and Z (which in this article we will mostly assume to be the same). To make 
things more transparent, we keep everything explicit in terms of the function h, 
instead of using the usual approach via Stein equation and its solution. 

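Unless otherwise stated, we will assume throughout this article that

$$\mathbb{E}X = 0, \qquad \operatorname{Var}X_i = \sigma_i^2, \qquad \mathbb{E}|X_i|^3 = \tau_i^3 < \infty, \qquad \bar\tau = \sup_i \tau_i. \tag{2.7}$$

We will denote by $\Sigma = (\sigma_{ij})_{1 \le i,j \le n}$ the covariance matrix of $X$, where $\sigma_{ij} =
\mathbb{E}(X_iX_j)$, and we have $\sigma_{ii} = \sigma_i^2$.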

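Lemma 2.1. Let $(X,X',G)$ be a Stein coupling. Let $X''$ and $\tilde D$ be $n$-dimensional
random vectors and let $S$ be a random $n \times n$ matrix. Define $D = X' - X$
and $D' = X'' - X$. Assume that, for all $k$ and $l$,

$$\mathbb{E}(G_k D_l \mid X) = \mathbb{E}(G_k \tilde D_l \mid X), \qquad \mathbb{E}(S_{kl} \mid X) = \sigma_{kl}. \tag{2.8}$$

Let $Z \sim \mathrm{MVN}_n(0,\Sigma)$ be independent of the previous random vectors. Then, for
any three times partially differentiable function $h$,

$$\mathbb{E}h(X) - \mathbb{E}h(Z) = \frac{1}{2}\int_0^1 \mathbb{E}R_1(t)\,dt - \frac{1}{2}\int_0^1\!\!\int_0^1 t^{1/2}\,\mathbb{E}R_2(t,s)\,ds\,dt + \frac{1}{2}\int_0^1\!\!\int_0^1 (1-s)\,t^{1/2}\,\mathbb{E}R_3(t,s)\,ds\,dt, \tag{2.9}$$

where

$$R_1(t) = \sum_{k,l} (G_k \tilde D_l - S_{kl})\, h_{kl}(\sqrt{t}X'' + \sqrt{1-t}Z),$$

$$R_2(t,u) = \sum_{k,l,m} (G_k \tilde D_l - S_{kl})\, D'_m\, h_{klm}(\sqrt{t}X + u\sqrt{t}D' + \sqrt{1-t}Z),$$

$$R_3(t,u) = \sum_{k,l,m} G_k D_l D_m\, h_{klm}(\sqrt{t}X + u\sqrt{t}D + \sqrt{1-t}Z),$$

provided that $\mathbb{E}R_i(\cdot)$ exists for $i = 1,2,3$. In particular,

$$|\mathbb{E}h(X) - \mathbb{E}h(Z)| \le \frac{1}{2}\sup_t |\mathbb{E}R_1(t)| + \frac{1}{3}\sup_{t,s} |\mathbb{E}R_2(t,s)| + \frac{1}{6}\sup_{t,s} |\mathbb{E}R_3(t,s)|.$$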
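
The numerical constants in front of the three suprema come from integrating the deterministic weights of the remainder terms. The short computation below is a sketch under the assumption that the weights are $1$, $t^{1/2}$ and $(1-s)t^{1/2}$ respectively; the last weight is what a Taylor expansion with integral remainder produces for $R_3$, and it is an inference rather than something legible in the source.

```latex
% Assumed weights: 1 for R_1, t^{1/2} for R_2, (1-s) t^{1/2} for R_3.
\begin{align*}
  \frac{1}{2}\int_0^1 dt &= \frac{1}{2}, \\
  \frac{1}{2}\int_0^1\!\!\int_0^1 t^{1/2}\,ds\,dt &= \frac{1}{2}\cdot\frac{2}{3} = \frac{1}{3}, \\
  \frac{1}{2}\int_0^1\!\!\int_0^1 (1-s)\,t^{1/2}\,ds\,dt &= \frac{1}{2}\cdot\frac{1}{2}\cdot\frac{2}{3} = \frac{1}{6}.
\end{align*}
```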

It seems rather difficult at this point to convey the purpose of all the random
vectors appearing in the lemma. Probably the best way to get an intuition for
such couplings is to go through the different applications given later on; we also
refer to Chen and Röllin (2010) for the univariate case, where further examples
are discussed. We note that finding the appropriate random vectors will usually
require some trial and error.

Remark 2.3. Let us make a few comments at this point.

1. If $(X,X',G)$ is a Stein coupling and if $(G,\tilde D)$ is independent of $X''$, then
$\mathbb{E}R_1(t) = 0$.

2. Except for the case of local dependence in Section 3.5 we will choose $\tilde D = D$.

3. The result can be easily extended to include other error terms from the
proof of the lemma under weaker conditions. We will use the following
extension later on. If $(X,X',G)$ is not a Stein coupling, then one can
include a measure of how closely (2.5) is satisfied; with

$$R_0(t) = \sum_k \bigl\{X_k h_k(\sqrt{t}X + \sqrt{1-t}Z) - G_k h_k(\sqrt{t}X' + \sqrt{1-t}Z) + G_k h_k(\sqrt{t}X + \sqrt{1-t}Z)\bigr\},$$

an additional $\frac{1}{2}\int_0^1 t^{-1/2}\,\mathbb{E}R_0(t)\,dt$ would appear on the right hand side of
(2.9).

4. If $(X,X',G)$ is not a Stein coupling, the identities (2.6) (especially the
second equation) need not hold any more and would have to be replaced
by corresponding approximate versions.

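5. Note that the difference $|G_k \tilde D_l - S_{kl}|$ in $R_2(t,u)$ can usually be estimated
by $|G_k \tilde D_l| + |S_{kl}|$ without changing the rates of convergence. This is not
the case for $R_1(t)$, where more care is required.

Proof of Lemma 2.1. Define the interpolating sequence $Y_t = \sqrt{t}X + \sqrt{1-t}Z$,
$0 \le t \le 1$. Starting from (2.3), and using (2.4) and (2.5), we obtain

$$\mathbb{E}h(X) - \mathbb{E}h(Z) = \int_0^1 \frac{\partial}{\partial t}\mathbb{E}h(Y_t)\,dt$$

$$= \frac{1}{2}\int_0^1 \mathbb{E}\Bigl\{\sum_k \frac{X_k h_k(Y_t)}{\sqrt{t}} - \sum_k \frac{Z_k h_k(Y_t)}{\sqrt{1-t}}\Bigr\}\,dt \tag{2.10}$$

$$= \frac{1}{2}\int_0^1 \mathbb{E}\Bigl\{\sum_k \frac{G_k\bigl(h_k(Y'_t) - h_k(Y_t)\bigr)}{\sqrt{t}} - \sum_{k,l} \sigma_{kl} h_{kl}(Y_t)\Bigr\}\,dt,$$

where $Y'_t = \sqrt{t}X' + \sqrt{1-t}Z$. Let us recall the definition of $R_1(t)$ and introduce
two additional error terms:

$$R_1(t) = \sum_{k,l} (G_k \tilde D_l - S_{kl})\, h_{kl}(Y''_t),$$

$$R_4(t) := \sum_{k,l} (S_{kl} - \sigma_{kl})\, h_{kl}(Y_t), \qquad R_5(t) := \sum_{k,l} G_k (D_l - \tilde D_l)\, h_{kl}(Y_t),$$

where $Y''_t = \sqrt{t}X'' + \sqrt{1-t}Z$. Applying

$$h_k(Y'_t) - h_k(Y_t) = \int_0^1 \sqrt{t}\sum_l D_l\, h_{kl}(Y_t + s\sqrt{t}D)\,ds$$

to (2.10), and adding and subtracting the terms from $R_1(t)$, $R_4(t)$ and $R_5(t)$
yields

$$\mathbb{E}h(X) - \mathbb{E}h(Z) = \frac{1}{2}\int_0^1 \mathbb{E}\Bigl\{\int_0^1 \sum_{k,l} G_k D_l\, h_{kl}(Y_t + s\sqrt{t}D)\,ds - \sum_{k,l} \sigma_{kl} h_{kl}(Y_t)\Bigr\}\,dt$$

$$= \frac{1}{2}\int_0^1 \mathbb{E}\bigl\{R_1(t) + R_4(t) + R_5(t)\bigr\}\,dt$$

$$\quad + \frac{1}{2}\int_0^1 \mathbb{E}\Bigl\{\sum_{k,l} (G_k \tilde D_l - S_{kl})\bigl(h_{kl}(Y_t) - h_{kl}(Y''_t)\bigr)\Bigr\}\,dt$$

$$\quad + \frac{1}{2}\int_0^1\!\!\int_0^1 \mathbb{E}\Bigl\{\sum_{k,l} G_k D_l \bigl(h_{kl}(Y_t + s\sqrt{t}D) - h_{kl}(Y_t)\bigr)\Bigr\}\,ds\,dt.$$

Note that, under (2.8), $\mathbb{E}R_4(t) = \mathbb{E}R_5(t) = 0$. Taylor expansion in the last
two lines yields the final result; we refer to Chen and Röllin (2010) for a more
detailed exposition of the proof in the univariate case. □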

We will use the following notation in the remainder of this article. We denote
by $\|\cdot\|$ the supremum norm of functions. For $k \ge 1$ and a $k$-times partially
differentiable function $f : \mathbb{R}^n \to \mathbb{R}$, we let

$$|f|_k = \sup_{1 \le i_1,\dots,i_k \le n} \|f_{i_1 \dots i_k}\|.$$

For functions $g : \mathbb{R} \to \mathbb{R}$, in order to make the formulas more readable, we will
use the notation $\|g'\|$, $\|g''\|$, $\dots$, instead of using the equivalent $|g|_1$, $|g|_2$, $\dots$.

Remark 2.4. It will sometimes be the case that one is interested in comparing
the distributions of $f(X)$ and $f(Z)$ for some specific function $f : \mathbb{R}^n \to \mathbb{R}$. To
do so, choose $h(x) = g(f(x))$ for $g : \mathbb{R} \to \mathbb{R}$. If (1.1) is small for many
functions $g$, then we can conclude that $f(X)$ and $f(Z)$ are close in distribution.
If $g$ is three times differentiable, we record the useful estimates

$$|h|_2 \le |f|_2\|g'\| + |f|_1^2\|g''\|,$$

$$|h|_3 \le |f|_3\|g'\| + 3|f|_1|f|_2\|g''\| + |f|_1^3\|g'''\|.$$
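
The two estimates are direct consequences of the chain rule for $h = g \circ f$. The derivation below spells this out; the arguments of $f$ and $g$ are suppressed, and subscripts denote partial derivatives as in the rest of the text.

```latex
% Chain rule for h(x) = g(f(x)); subscripts denote partial derivatives.
\begin{align*}
  h_{ij}  &= g''(f)\, f_i f_j + g'(f)\, f_{ij}, \\
  h_{ijk} &= g'''(f)\, f_i f_j f_k
            + g''(f)\,\bigl(f_{ij} f_k + f_{ik} f_j + f_{jk} f_i\bigr)
            + g'(f)\, f_{ijk}.
\end{align*}
```

Taking suprema term by term gives the bounds on $|h|_2$ and $|h|_3$; the factor 3 counts the three mixed terms in $h_{ijk}$.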

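Remark 2.5. A particular function of interest is

$$f(x) = \log \sum_{p=1}^{m} \exp\bigl(\beta\, y^{(p)}(x)\bigr)$$

for functions $y^{(p)} : \mathbb{R}^n \to \mathbb{R}$, $1 \le p \le m$. Define $\gamma_k = \sup_p |y^{(p)}|_k$; it is
straightforward to check that

$$|f|_1 \le \beta\gamma_1, \qquad |f|_2 \le \beta\gamma_2 + 2\beta^2\gamma_1^2, \qquad |f|_3 \le \beta\gamma_3 + 6\beta^2\gamma_1\gamma_2 + 6\beta^3\gamma_1^3.$$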



3 COUPLINGS 

Many of the Stein couplings discussed by Chen and Röllin (2010) can be adapted
to the multivariate case: exchangeable pairs, size-biasing, local dependence,
etc. Instead of generalising all of them here (which will be done elsewhere with
emphasis on multivariate normal approximation for fixed dimension) we only go
through a few of them explicitly and instead present some other couplings not
discussed by Chen and Röllin (2010).



3.1 A theoretical result 

One may wonder if, given a pair $(X,X')$, there exists a $G$ to make the triple
$(X,X',G)$ a Stein coupling. This question has been answered by Chen and
Röllin (2010) for the univariate case, but the construction given there can also
be used in the multivariate setting. Let $\mathcal{F} = \sigma(X)$ be the $\sigma$-algebra induced by
$X$ and let $\mathcal{F}' = \sigma(X')$. Define formally the series

$$G = -X + \mathbb{E}(X \mid \mathcal{F}') - \mathbb{E}\bigl(\mathbb{E}(X \mid \mathcal{F}') \mid \mathcal{F}\bigr) + \mathbb{E}\bigl(\mathbb{E}(\mathbb{E}(X \mid \mathcal{F}') \mid \mathcal{F}) \mid \mathcal{F}'\bigr) - \cdots.$$

If the series converges absolutely in each coordinate, then this will make
$(X,X',G)$ a Stein coupling. Indeed, $\mathbb{E}(G \mid \mathcal{F}) = -X$ and $\mathbb{E}(G \mid \mathcal{F}') = 0$ so that
(2.5) is satisfied.

To motivate the choice of $G$ used in the next few settings, consider the case
where the coordinates of $X$ are independent. Let $I$ be uniformly distributed on
$\{1,\dots,n\}$, independent of all else. Define the vector $X^{(i)}$ by

$$X^{(i)}_k = (1 - \delta_{ki})X_k,$$

where $\delta_{ij}$ is the Kronecker delta. Let $X' = X^{(I)}$; that is, $X'$ is the vector
where we have set a randomly chosen coordinate to 0. Denote by $e_i$ the unit
vector in direction $i$. Using independence of the coordinates,

$$\mathbb{E}(X \mid \mathcal{F}') = \mathbb{E}(X^{(I)} + e_I X_I \mid X^{(I)}) = X^{(I)} = X'$$

and

$$\mathbb{E}(X' \mid \mathcal{F}) = \frac{1}{n}\sum_i X^{(i)} = \Bigl(1 - \frac{1}{n}\Bigr)X.$$

Hence,

$$G = -X + X' - \Bigl(1 - \frac{1}{n}\Bigr)X + \Bigl(1 - \frac{1}{n}\Bigr)X' - \Bigl(1 - \frac{1}{n}\Bigr)^2 X + \Bigl(1 - \frac{1}{n}\Bigr)^2 X' + \cdots$$

$$= -e_I X_I - \Bigl(1 - \frac{1}{n}\Bigr)e_I X_I - \Bigl(1 - \frac{1}{n}\Bigr)^2 e_I X_I - \cdots = -n\,e_I X_I.$$
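
The coupling just derived can be checked exactly on a small example. The sketch below enumerates all outcomes of three independent coordinates and compares both sides of the Stein coupling identity (2.5) with $G = -n e_I X_I$ and $X' = X^{(I)}$. The two-point law and the test polynomial are hypothetical choices made only for this illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical centred two-point law for each coordinate: P(-1) = 2/3, P(2) = 1/3.
LAW = {-1: Fraction(2, 3), 2: Fraction(1, 3)}
N = 3

def grad_f(x):
    """Partial derivatives of the test polynomial f(x) = x0^2 x1^2 + x1 x2^3 + x0^3."""
    x0, x1, x2 = x
    return (2 * x0 * x1 ** 2 + 3 * x0 ** 2,
            2 * x0 ** 2 * x1 + x2 ** 3,
            3 * x1 * x2 ** 2)

def stein_coupling_sides():
    """Return (E sum_i X_i f_i(X), E sum_i G_i (f_i(X') - f_i(X))) exactly."""
    lhs = Fraction(0)
    rhs = Fraction(0)
    for x in product(LAW, repeat=N):
        p = Fraction(1)
        for v in x:
            p *= LAW[v]
        fx = grad_f(x)
        lhs += p * sum(x[i] * fx[i] for i in range(N))
        for idx in range(N):  # idx plays the role of the uniform index I
            x_prime = tuple(0 if k == idx else x[k] for k in range(N))
            g_idx = -N * x[idx]  # the only non-zero coordinate of G
            rhs += p * Fraction(1, N) * g_idx * (grad_f(x_prime)[idx] - fx[idx])
    return lhs, rhs

lhs, rhs = stein_coupling_sides()
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to a tolerance.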



3.2 Independent coordinates 

In order to illustrate the method in a simple setting, we start with independent
coordinates using $(X,X',G)$ derived in the previous section.

Theorem 3.1. Let $X$ be as in (2.7) and assume the coordinates of $X$ are
independent. If $Z$ is a vector of independent centred Gaussian random variables
with the same variances as $X$, then

$$|\mathbb{E}h(X) - \mathbb{E}h(Z)| \le \frac{5}{6}\sum_i \tau_i^3\,\|h_{iii}\|.$$

Proof. Let $G_l = -n\delta_{lI}X_l$ and $X' = X'' = X^{(I)}$, hence $D_l = D'_l = -\delta_{lI}X_l$. Let
$\tilde D = D$ and $S_{ij} = n\sigma_i^2\delta_{iI}\delta_{jI}$. It is easy to see that $(X,X',G)$ is a Stein coupling,
that (2.8) is satisfied and that $(G,D)$ is independent of $X''$; the latter implies
$\mathbb{E}R_1(t) = 0$ (see Remark 2.3). The following estimates are immediate:

$$|\mathbb{E}R_2(t,u)| \le \sum_i \|h_{iii}\|\bigl(\sigma_i^2\,\mathbb{E}|X_i| + \mathbb{E}|X_i|^3\bigr) \le 2\sum_i \|h_{iii}\|\,\mathbb{E}|X_i|^3,$$

$$|\mathbb{E}R_3(t,u)| \le \sum_i \|h_{iii}\|\,\mathbb{E}|X_i|^3.$$

Lemma 2.1 concludes the theorem. □

Using Lindeberg's telescoping sum and Taylor expansion, and noting that
the first two moments of $X$ and $Z$ match, one easily obtains

$$|\mathbb{E}h(X) - \mathbb{E}h(Z)| \le \frac{1}{6}\sum_i \bigl(\mathbb{E}|X_i|^3 + \mathbb{E}|Z_i|^3\bigr)\|h_{iii}\| \le \frac{1}{6}\Bigl(1 + 2\sqrt{2/\pi}\Bigr)\sum_i \tau_i^3\,\|h_{iii}\|.$$

As usual for independent random variables, the constants obtained via Stein's
method are larger than those obtained from other methods. But, of course,
applications with dependencies are the main purpose of using Stein's method.
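
The Gaussian third absolute moment that enters the Lindeberg-type bound can be checked numerically. The sketch below approximates $\mathbb{E}|Z|^3$ for a standard normal $Z$ by a midpoint rule and compares it with the closed form $2\sqrt{2/\pi}$; the step count and cutoff are arbitrary choices. The resulting constant assumes $\mathbb{E}|Z_i|^3 \le 2\sqrt{2/\pi}\,\tau_i^3$, which follows from $\sigma_i \le \tau_i$.

```python
import math

def abs_third_moment_std_normal(steps=200_000, cutoff=12.0):
    """Midpoint-rule approximation of E|Z|^3 for Z ~ N(0, 1)."""
    width = cutoff / steps
    total = 0.0
    for k in range(steps):
        z = (k + 0.5) * width
        total += z ** 3 * math.exp(-0.5 * z * z)
    # Factor 2 for symmetry, 1/sqrt(2*pi) for the normal density.
    return 2.0 * total * width / math.sqrt(2.0 * math.pi)

closed_form = 2.0 * math.sqrt(2.0 / math.pi)
# Constant in front of sum_i tau_i^3 ||h_iii|| in the Lindeberg-type bound.
lindeberg_constant = (1.0 + closed_form) / 6.0
```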



3.3 Weak dependence 

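A simple way to measure how much a single coordinate $X_i$ is influenced by
the other coordinates is to look at the fluctuation of the conditional mean and
variance of $X_i$. To this end, let $X$ be as in (2.7) and define $X^{(i)}$ as in Section 3.1.
Furthermore, let

$$\mu_i(X^{(i)}) = \mathbb{E}(X_i \mid X^{(i)}), \qquad \sigma_i^2(X^{(i)}) = \operatorname{Var}(X_i \mid X^{(i)}).$$

Then we have the following.

Theorem 3.2. Let $X$ be as in (2.7) and let $Z \sim \mathrm{MVN}_n(0,\Sigma)$. Then

$$|\mathbb{E}h(X) - \mathbb{E}h(Z)| \le \sum_i \|h_i\|\,\mathbb{E}|\mu_i(X^{(i)})| + \frac{1}{2}\sum_i \|h_{ii}\|\bigl(\mathbb{E}\mu_i(X^{(i)})^2 + \mathbb{E}|\sigma_i^2(X^{(i)}) - \sigma_i^2|\bigr) + \frac{5}{6}\sum_i \tau_i^3\,\|h_{iii}\|. \tag{3.1}$$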



Proof. Define $G$, $X'$, $X''$, $S$ as in the proof of Theorem 3.1. In addition to
the error terms $R_2$ and $R_3$, which can be bounded in the same way as for
Theorem 3.1, we have

$$\mathbb{E}R_0(t) = \mathbb{E}\sum_i X_i h_i(\sqrt{t}X^{(i)} + \sqrt{1-t}Z) = \mathbb{E}\sum_i \mu_i(X^{(i)})\, h_i(\sqrt{t}X^{(i)} + \sqrt{1-t}Z)$$

(see Remark 2.3) and

$$\mathbb{E}R_1(t) = \mathbb{E}\sum_i (X_i^2 - \sigma_i^2)\, h_{ii}(\sqrt{t}X^{(i)} + \sqrt{1-t}Z)$$

$$= \mathbb{E}\sum_i \bigl((X_i - \mu_i(X^{(i)}))^2 - \sigma_i^2 + \mu_i(X^{(i)})^2\bigr)\, h_{ii}(\sqrt{t}X^{(i)} + \sqrt{1-t}Z).$$

This easily leads to the final bound. □

Note that if the $X_i$ are independent, Theorem 3.2 reduces to Theorem 3.1.
Götze and Tikhomirov (2006) assumed that $\mu_i(X^{(i)}) = 0$ almost surely to obtain
convergence rates to the semi-circular law in random matrix theory under such
dependence.



3.4 Constant sum and symmetry 

Recall the classic occupancy problem from the introduction. The sum of the
vector that describes the number of balls in each urn will be equal to the total
number of balls and hence, itself, not satisfy a central limit theorem. This
motivates us to consider general vectors $X$ that satisfy

$$\sum_{i=1}^{n} X_i = 0 \tag{3.2}$$

almost surely. Note that this implies in particular that

$$\sum_j \sigma_{ij} = 0 \tag{3.3}$$

for each $i$.
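
For the occupancy example, the vanishing row sums of the covariance matrix can be verified exactly on a small instance. The sketch below enumerates all placements of 4 balls into 3 urns (sizes chosen only to keep the enumeration small) and computes the covariance matrix of the centred counts with rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def occupancy_covariance(n_urns, m_balls):
    """Exact covariance matrix of the urn counts, by enumerating all n^m placements."""
    outcomes = list(product(range(n_urns), repeat=m_balls))
    prob = Fraction(1, len(outcomes))
    mean = Fraction(m_balls, n_urns)
    cov = [[Fraction(0)] * n_urns for _ in range(n_urns)]
    for placement in outcomes:
        counts = [placement.count(u) for u in range(n_urns)]
        for i in range(n_urns):
            for j in range(n_urns):
                cov[i][j] += prob * (counts[i] - mean) * (counts[j] - mean)
    return cov

cov = occupancy_covariance(3, 4)
```

In this instance every off-diagonal entry equals minus the diagonal entry divided by $n - 1$, which is the relation exchangeability forces on a constant-sum vector.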

To apply our method, we will need to make more assumptions. We say that
a function $h : \mathbb{R}^n \to \mathbb{R}$ is symmetric if it is invariant under permutation of its
coordinates. A random vector $X = (X_1,\dots,X_n)$ is called exchangeable if its
distribution is invariant under permutation of the coordinates.

Theorem 3.3. Let $X$ be a random vector satisfying (2.7) and (3.2). Let $Z \sim
\mathrm{MVN}_n(0,\Sigma)$. If $h$ is symmetric, then

$$|\mathbb{E}h(X) - \mathbb{E}h(Z)| \le |h|_2\Bigl[\operatorname{Var}\Bigl(\sum_i X_i^2\Bigr)\Bigr]^{1/2} + |h|_3\Bigl(\frac{8}{3}\,\bar\tau\sum_{i,j}|\sigma_{ij}| + \frac{40}{3}\,n\bar\tau^3\Bigr). \tag{3.4}$$

If $(X_1,\dots,X_n)$ is exchangeable then

$$|\mathbb{E}h(X) - \mathbb{E}h(Z)| \le |h|_2\Bigl[\operatorname{Var}\Bigl(\sum_i X_i^2\Bigr)\Bigr]^{1/2} + 19\,|h|_3\,n\bar\tau^3. \tag{3.5}$$



Proof. For $x \in \mathbb{R}^n$, let $x^{ik} \in \mathbb{R}^n$ be the vector obtained by interchanging the
$i$th and $k$th coordinate of $x$ (if $i = k$ then $x^{ik} = x$). Note that, under symmetry
of $h$,

$$h_i(x^{ik}) = h_k(x), \tag{3.6}$$

respectively, under exchangeability,

$$\mathbb{E}\{\varphi(X_i)h_i(X^{ik})\} = \mathbb{E}\{\varphi(X_k)h_i(X)\}, \tag{3.7}$$

for any function $\varphi$ for which the expectations exist. Furthermore, for $(i,j,k,l) \in
[n]^4$ with

$$i = k \iff j = l, \tag{3.8}$$

denote by $x^{ijkl}$ a permutation of $x$ such that, for symmetric $h$,

$$h_{ik}(x^{ijkl}) = h_{jl}(x), \tag{3.9}$$

respectively, such that under exchangeability of $X$,

$$\mathbb{E}\{\varphi(X_i,X_k)h_{ik}(X^{ijkl})\} = \mathbb{E}\{\varphi(X_j,X_l)h_{ik}(X)\} \tag{3.10}$$

for all functions $\varphi$ for which the expectations exist. Note that this permutation
can be defined independently of $x$ and $h$: if $h$ is symmetric, keep $[n] \setminus \{i,j,k,l\}$
fixed, map $i \mapsto j$ and $k \mapsto l$ (which is always well-defined because of (3.8)) and
map the remaining numbers among each other in any arbitrary, but fixed way;
respectively, if $X$ is exchangeable, keep again $[n] \setminus \{i,j,k,l\}$ fixed, but map now
$j \mapsto i$ and $l \mapsto k$ and, again, map the remaining numbers among each other in
any arbitrary, but fixed way. Let $(I,J,K,L)$ be distributed on $[n]^4$, such that
$(I,J,K)$ is uniform on $[n]^3$ and, given $(I,J,K)$, $L$ is uniform on $[n] \setminus \{J\}$ if
$I \ne K$, and $J = L$ if $I = K$; hence, $(I,J,K,L)$ will always satisfy (3.8). Define

$$X' := X^{IK}, \qquad X'' := X^{IJKL}$$

and

$$G_k = -n\delta_{kI}X_k, \qquad \tilde D_k = D_k, \qquad S_{kl} = n^2\delta_{kI}\delta_{lK}\sigma_{kl};$$

note that

$$D_l = \delta_{lI}(X_K - X_I) + \delta_{lK}(X_I - X_K), \qquad D'_l = \sum_{m \in \{I,J,K,L\}} \delta_{lm}(X''_m - X_m).$$

Fix $t$ and let, for notational convenience, $f_\bullet(x) = \mathbb{E}h_\bullet(\sqrt{t}x + \sqrt{1-t}Z)$, where $\bullet$
stands for $i$, $ij$ or $ijk$. Clearly, $\mathbb{E}\sum_k G_k f_k(X) = -\mathbb{E}\sum_k X_k f_k(X)$. Under symmetry
of $h$, using (3.6) and then (3.2), we note that

$$\mathbb{E}\sum_k G_k f_k(X') = -n\,\mathbb{E}\{X_I f_I(X^{IK})\} = -n\,\mathbb{E}\{X_I f_K(X)\} = -\frac{1}{n}\,\mathbb{E}\sum_{i,k} X_i f_k(X) = 0,$$

whereas, under exchangeability of $X$, we can use (3.7) instead to obtain

$$\mathbb{E}\sum_k G_k f_k(X') = -n\,\mathbb{E}\{X_I f_I(X^{IK})\} = -n\,\mathbb{E}\{X_K f_I(X)\} = -\frac{1}{n}\,\mathbb{E}\sum_{i,k} X_k f_i(X) = 0.$$

Hence, $(X,X',G)$ is a Stein coupling under either condition. Now,

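$$-\mathbb{E}R_1(t) = \mathbb{E}\sum_{k,l} (S_{kl} - G_k D_l)\, f_{kl}(X'')$$

$$= \mathbb{E}\sum_{k,l} \bigl[n^2\delta_{kI}\delta_{lK}\sigma_{kl} + \delta_{kI}\,nX_k\bigl(\delta_{lI}(X_K - X_I) + \delta_{lK}(X_I - X_K)\bigr)\bigr] f_{kl}(X'')$$

$$= n^2\,\mathbb{E}\,\sigma_{IK} f_{IK}(X'') + n\,\mathbb{E}X_I(X_K - X_I)f_{II}(X'') + n\,\mathbb{E}X_I(X_I - X_K)f_{IK}(X'').$$

We proceed assuming symmetric $h$; the proof under exchangeability is analogous
and hence omitted; however, we will indicate the places where the arguments
differ. Making use of (3.9) and (3.3),

$$n^2\,\mathbb{E}\{\sigma_{IK} f_{IK}(X'')\} = n^2\,\mathbb{E}\{\sigma_{IK} f_{JL}(X)\}$$

$$= \frac{1}{n}\sum_i \sigma_i^2 \sum_j \mathbb{E}f_{jj}(X) + \frac{1}{n(n-1)}\sum_i \sum_{k \ne i} \sigma_{ik} \sum_j \sum_{l \ne j} \mathbb{E}f_{jl}(X)$$

$$= \frac{1}{n}\sum_i \sigma_i^2 \sum_j \mathbb{E}f_{jj}(X) - \frac{1}{n(n-1)}\sum_i \sigma_i^2 \sum_j \sum_{l \ne j} \mathbb{E}f_{jl}(X)$$

(under exchangeability note that $\sigma_{ik} = \sigma_{jl}$). Furthermore, using (3.9) and (3.2),

$$n\,\mathbb{E}\{X_I(X_K - X_I)f_{II}(X'')\} = n\,\mathbb{E}\{X_I(X_K - X_I)f_{JJ}(X)\}$$

$$= \frac{1}{n^2}\,\mathbb{E}\sum_{i,j,k} X_i(X_k - X_i)f_{jj}(X) = -\frac{1}{n}\,\mathbb{E}\Bigl\{\sum_i X_i^2 \sum_j f_{jj}(X)\Bigr\}$$

and

$$n\,\mathbb{E}\{X_I(X_I - X_K)f_{IK}(X'')\} = n\,\mathbb{E}\{X_I(X_I - X_K)f_{JL}(X)\}$$

$$= \frac{1}{n^2(n-1)}\,\mathbb{E}\sum_{i,j}\sum_{k \ne i}\sum_{l \ne j} X_i(X_i - X_k)f_{jl}(X) = \frac{1}{n(n-1)}\,\mathbb{E}\Bigl\{\sum_i X_i^2 \sum_j \sum_{l \ne j} f_{jl}(X)\Bigr\},$$

where for the last equality we used that $\sum_{k \ne i} X_i(X_i - X_k) = nX_i^2$ by (3.2) (under exchangeability
use (3.10) instead of (3.9)). Hence,

$$|\mathbb{E}R_1(t)| \le \frac{1}{n(n-1)}\Bigl|\mathbb{E}\Bigl\{\Bigl(\sum_i X_i^2 - \sum_i \sigma_i^2\Bigr)\sum_j \sum_{l \ne j} f_{jl}(X)\Bigr\}\Bigr| + \frac{1}{n}\Bigl|\mathbb{E}\Bigl\{\Bigl(\sum_i \sigma_i^2 - \sum_i X_i^2\Bigr)\sum_j f_{jj}(X)\Bigr\}\Bigr|$$

$$\le 2\,|h|_2\Bigl[\operatorname{Var}\Bigl(\sum_i X_i^2\Bigr)\Bigr]^{1/2}.$$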

This gives the first part of the result. Now,

$$-\mathbb{E}R_2(t,u) = \mathbb{E}\sum_{k,l,m} (S_{kl} - G_k D_l)\,D'_m\, f_{klm}(X + uD')$$

$$= n^2\,\mathbb{E}\sum_{m \in \{I,J,K,L\}} \sigma_{IK}(X''_m - X_m)\, f_{IKm}(X + uD') + n\,\mathbb{E}\sum_{l \in \{I,K\},\, m \in \{I,J,K,L\}} X_I (X'_l - X_l)(X''_m - X_m)\, f_{Ilm}(X + uD'),$$

hence

$$|\mathbb{E}R_2(t,u)| \le 8\,|h|_3\,\bar\tau\sum_{i,j}|\sigma_{ij}| + 32\,|h|_3\,n\bar\tau^3.$$

Similarly,

$$\mathbb{E}R_3(t,u) = \mathbb{E}\sum_{k,l,m} G_k D_l D_m\, f_{klm}(X + uD) = -n\,\mathbb{E}\sum_{l \in \{I,K\},\, m \in \{I,K\}} X_I (X'_l - X_l)(X'_m - X_m)\, f_{Ilm}(X + uD),$$

hence

$$|\mathbb{E}R_3(t,u)| \le 16\,|h|_3\,n\bar\tau^3.$$

This yields (3.4) (under either symmetry of $h$ or exchangeability of $X$); to obtain
(3.5) simply note that $\sigma_{ij} = -(n-1)^{-1}\sigma_i^2$ for all $j \ne i$ because of (3.3) and
exchangeability. □

3.5 Local dependence 

Stein couplings for this dependence structure have already been discussed by
Chen and Röllin (2010), based on similar decompositions that appeared in many
other places; we refer to the more detailed discussion by Chen and Röllin (2010).
We give the multivariate version here.

Let $X = (X_1,\dots,X_n)$ be as in (2.7). Assume that, for each $i \in [n] :=
\{1,\dots,n\}$, there is a subset $A_i \subset [n]$ such that $X_{A_i^c}$ and $X_i$ are independent.
Assume further that for each $i \in [n]$ and $j \in A_i$ there is a subset $B_{ij} \subset [n]$ such
that $A_i \subset B_{ij}$ and $X_{B_{ij}^c}$ is independent of $(X_i,X_j)$.

Theorem 3.4. Let $X$ be as above. Let $Z \sim \mathrm{MVN}_n(0,\Sigma)$. Then, for any three
times partially differentiable function $h$,

$$|\mathbb{E}h(X) - \mathbb{E}h(Z)| \le \frac{1}{3}\sum_i \sum_{j \in A_i} \sum_{k \in B_{ij}} \bigl(|\sigma_{ij}|\,\mathbb{E}|X_k| + \mathbb{E}|X_iX_jX_k|\bigr)\|h_{ijk}\|$$

$$\quad + \frac{1}{6}\sum_i \sum_{j \in A_i} \sum_{k \in A_i} \mathbb{E}|X_iX_jX_k|\,\|h_{ijk}\| \le \frac{5}{6}\,\bar\tau^3\,n\,\eta\,|h|_3,$$

where $\eta = \sup_i \sum_{j \in A_i} |B_{ij}|$.

Proof. Let $I$ be uniform on $[n]$ and, given $I$, let $J$ be uniform on $A_I$. Define the
vectors $X'$, $X''$, $G$ and $\tilde D$ and the matrix $S$ as

$$G_k = -\delta_{kI}\,nX_k, \qquad X'_k = \mathrm{I}[k \notin A_I]X_k, \qquad X''_k = \mathrm{I}[k \notin B_{IJ}]X_k,$$

$$S_{kl} = n|A_I|\,\delta_{kI}\delta_{lJ}\,\sigma_{kl}, \qquad \tilde D_k = -|A_I|\,\delta_{kJ}X_k.$$

Note that $X'$ is independent of $G$, which makes $(X,X',G)$ a Stein coupling,
similarly as for the independent case. Furthermore, $X''$ is independent of $(G,\tilde D)$,
hence $\mathbb{E}R_1(t) = 0$. The final bound follows now easily from Lemma 2.1. □

Note that an $m$-dependent sequence is a special case of local dependence: we
have $|A_i| = 1 + 2m$ and $|B_{ij}| \le 1 + 3m$. However, the crucial aspect here is that
the exact structure of the dependence is only important in terms of the size of
$A_i$ and $B_{ij}$. Any graph with maximal degree $m$ that describes the dependence
structure of $X$ (that is, two subsets of vertices are independent if there is no
edge between them) will have the upper bounds $|A_i| \le 1 + m$ and $|B_{ij}| \le 1 + 2m$.
In that case,

$$\eta \le 2(m+1)^2. \tag{3.11}$$
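
To make the neighbourhood sizes concrete, the following sketch builds the sets $A_i$ and $B_{ij}$ for an $m$-dependent sequence and evaluates a dependence measure. Two things are assumptions here: the reading of the dependence measure as $\sup_i \sum_{j \in A_i} |B_{ij}|$ (the definition is garbled in the source), and the particular choice of $B_{ij}$ as all indices within distance $m$ of $i$ or $j$. Indices are 0-based in the code.

```python
def neighbourhoods(n, m):
    """A_i and B_ij for an m-dependent sequence X_1..X_n (0-based indices here)."""
    def ball(centre, radius):
        return set(range(max(0, centre - radius), min(n, centre + radius + 1)))
    A = {i: ball(i, m) for i in range(n)}
    # B_ij must contain A_i and leave the rest independent of (X_i, X_j):
    # take everything within distance m of i or of j.
    B = {(i, j): ball(i, m) | ball(j, m) for i in range(n) for j in A[i]}
    return A, B

def eta(A, B):
    """sup_i sum_{j in A_i} |B_ij|, the assumed reading of the dependence measure."""
    return max(sum(len(B[i, j]) for j in A[i]) for i in A)

A, B = neighbourhoods(50, 3)
```

For an $m$-dependent sequence this gives a value of order $m^2$ that does not grow with $n$, which is the regime in which the bound of Theorem 3.4 is linear in $n$.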

4 APPLICATIONS 

In this section, we will discuss two different types of applications. First, we will
consider concrete functions $h$, for which we will determine under what kind of
dependencies (1.1) will be small. If we can control the first three derivatives of $h$,
then we can analyse the universality of the given $h$ with respect to dependence,
for example for the different settings of the previous section. The first two
applications below will be of this type. We will analyse universality with respect
to local dependence only, but it is clear that many of the other settings can be
used instead. In the case of local dependence, we will be interested in how big the
"neighbourhoods" $A_i$ and $B_{ij}$ are allowed to become while keeping the bounds
on (1.1) small enough. We will use $\eta$ from Theorem 3.4 as a simple measure
of neighbourhood size, and hence dependence. These applications are closely
related to Chatterjee (2005). Whereas in the first application of the SK-model
the dependence will enter in a straightforward way, in the second application
of last passage percolation on thin rectangles, a certain optimisation step will
have to be recalculated, including the measure of dependence $\eta$.

As a second type of application, we will consider more concrete vectors $X$,
for which we want to show that (1.1) is small for a large class of functions $h$.
In this situation, the structure of the dependence of $X$ will either fit into one
of the abstract settings of the previous section (this will be the case for classic
occupancy), or else, one will have to construct a Stein coupling from scratch;
the latter will be the case for the Curie-Weiss model.



4.1 Dependent environment in the Sherrington-Kirkpatrick spin glass 
model 

Consider the $N$-spin system $\{-1,1\}^N$. To each configuration $\sigma \in \{-1,1\}^N$ we
assign the (random) Hamiltonian

i<j 

where $\xi = (\xi_{ij})_{1 \le i < j \le N}$ is a family of random variables, which we call the
environment. Given the environment $\xi$, we assign to each $\sigma$ the probability



a H N (<r) 



where 
Let 

It was proved by Talagrand (2006) that $p_N(\beta,\xi) \to p_\infty(\beta)$, the solution of the
Parisi formula, if the $\xi_{ij}$ are independent standard Gaussians. Carmona and
Hu (2006) showed that the same limit holds if the Gaussians are replaced by
independent copies of any random variable $\xi$ with $\mathbb{E}\xi = 0$ and $\mathbb{E}|\xi|^3 < \infty$. We
shall extend this result to dependent environments. To this end define

where $Y_1,\dots,Y_n$ is any family of random variables with $|Y_i| \le 1$ for all $i$.

Lemma 4.1. Let $\xi = (\xi_1,\dots,\xi_n)$ be a random environment such that $\mathbb{E}\xi_i = 0$,
$\mathbb{E}\xi_i^2 = 1$ and $\mathbb{E}|\xi_i|^3 \le \bar\tau^3 < \infty$. Let $g \sim \mathrm{MVN}_n(0,\Sigma)$ where $\Sigma$ is the covariance
matrix of $\xi$. Then

$$|\mathbb{E}\log Z_n(\beta,\xi) - \mathbb{E}\log Z_n(\beta,g)| \le 5\beta^3\bar\tau^3 n\eta. \tag{4.1}$$

Proof. Let $h(\xi) = \log Z_n(\beta,\xi)$; it is easy to see from Remark 2.5 that

$$|h|_3 \le 6\beta^3$$

(note that $\gamma_2 = \gamma_3 = 0$ and $\gamma_1 \le 1$ as $|Y_i| \le 1$). Using Theorem 3.4, (4.1) is
immediate. □

The following statement is a direct consequence of Lemma 4.1 for $n = N(N-1)/2$
and $\beta$ replaced by $\beta N^{-1/2}$.

Theorem 4.2. Assume the environment $\xi$ satisfies the dependence structure of
Theorem 3.4 with $\eta = o(N^{1/2})$ and $\sigma_{ij} = 0$ for $i \ne j$. Then

$$p_N(\beta,\xi) \to p_\infty(\beta).$$

Consider a fixed $m$-regular graph $G$ on the set of vertices $V_N = \{(i,j) :
1 \le i < j \le N\}$. Let $h_{ij}$ be i.i.d. centred random variables with finite third
moments. Let

Then it is straightforward to see that these $\xi_{ij}$ are centred and uncorrelated
(note that $\xi_{ij}$ does not contain $h_{ij}$). Clearly, from (3.11), $\eta \le 2(m+1)^2$ and
hence we can apply Theorem 4.2 as long as $m = o(N^{1/4})$. Noticing that (4.1)
is independent of the underlying graph, we obtain the following.

Corollary 4.3. Let $G_N$ be a sequence of random $m_N$-regular graphs on $V_N$,
where $m_N = o(N^{1/4})$. Then, with $\xi$ as above,



1e g " logZ N (l3,O^Poo(P) 



almost surely. 



4.2 Last passage percolation for thin rectangles 

The following statement about smooth approximation of the maximum function
is well known (and easy to verify).

Lemma 4.4. Let $m$ be a positive integer. For each $y \in \mathbb{R}^m$, let $f_0(y) =
\max\{y_1,\dots,y_m\}$ and $f_\varepsilon(y) = \varepsilon\log\sum_i e^{y_i/\varepsilon}$. Then

$$0 \le f_\varepsilon(y) - f_0(y) \le \varepsilon\log m.$$

Consider functions $y^{(p)} : \mathbb{R}^n \to \mathbb{R}$, $p = 1,\dots,m$, and let

$$P_x = \max_{1 \le p \le m} y^{(p)}(x). \tag{4.2}$$
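
The smooth approximation of the maximum in Lemma 4.4 is easy to check numerically. The sketch below evaluates the log-sum-exp function with the usual shift by the maximum for numerical stability; the sample vector is hypothetical.

```python
import math

def smooth_max(y, eps):
    """f_eps(y) = eps * log(sum_i exp(y_i / eps)), computed stably."""
    top = max(y)
    return top + eps * math.log(sum(math.exp((v - top) / eps) for v in y))

y = [0.3, -1.0, 2.5, 2.4]
gaps = {eps: smooth_max(y, eps) - max(y) for eps in (1.0, 0.1, 0.01)}
```

The gap shrinks with $\varepsilon$, while the derivatives of the smooth function grow like inverse powers of $\varepsilon$; balancing these two effects is what determines the choice of $\varepsilon$ in the proof of Theorem 4.5.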



The following theorem is similar to a result obtained bv lChatterieel (|2005h . 
but now includes 77. 
Theorem 4.5. Let $P_x$ be as above with linear functions $y^{(p)}$. Let $X$ be a family of $n$ centred random variables with finite third moments satisfying the dependence structure as in Theorem 3.4. Let $g : \mathbb{R} \to \mathbb{R}$ be three times differentiable. Then, for $Z \sim \mathrm{MVN}_n(0, \Sigma)$,

$$|\mathbb{E} g(P_X) - \mathbb{E} g(P_Z)| \le (6\|g'\| + 6\|g''\| + \|g'''\|)\, n^{1/3} \eta^{1/3} \tau \gamma_1 \log(m)^{2/3},$$

where $\gamma_1 = \sup_{1 \le p \le m} |y^{(p)}|_1$.

Proof. Using the notation of Lemma 4.4, define the functions

$$h_0(x) = g(f_0(x)), \qquad h_\varepsilon(x) = g(f_\varepsilon(x)).$$

Clearly

$$|h_0(x) - h_\varepsilon(x)| \le \|g'\| \varepsilon \log m.$$

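We now use Remark 2.5. We clearly have $\gamma_2 = \gamma_3 = 0$. Furthermore, using again Lemma 4.4, it is easy to check that

$$|h_\varepsilon|_3 \le |f_\varepsilon|_3 \|g'\| + 3 |f_\varepsilon|_1 |f_\varepsilon|_2 \|g''\| + |f_\varepsilon|_1^3 \|g'''\|$$

$$\le \varepsilon^{-2} \gamma_1^3 (6\|g'\| + 6\|g''\| + \|g'''\|).$$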

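Thus, using Theorem 3.4,

$$|\mathbb{E} h_0(X) - \mathbb{E} h_0(Z)| \le \|g'\| \varepsilon \log m + |\mathbb{E} h_\varepsilon(X) - \mathbb{E} h_\varepsilon(Z)|$$

$$\le \|g'\| \varepsilon \log m + C \varepsilon^{-2} \tau^3 n \eta \gamma_1^3 (6\|g'\| + 6\|g''\| + \|g'''\|).$$

Choosing $\varepsilon = (\eta n)^{1/3} \log(m)^{-1/3} \tau \gamma_1$, we obtain the final bound. □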
Let us apply this result to last passage percolation on thin rectangles along the lines of Suidan (2006). Denote by $\pi$ an increasing path from $(1,1)$ to $(N,k)$ on the usual two dimensional lattice, where without loss of generality $k \le N$.
Let 



and let $P_x$ be as in (4.2), where the maximum ranges over all increasing paths $\pi$. Hence, $P_x$ is the (standardized) longest increasing path between $(1,1)$ and $(N,k)$, where each lattice point $(i,j)$ contributes $x_{ij}$ to the length of the path. If $X$ is an i.i.d. family of geometric or exponential random variables, then Johansson (2000) showed that the properly centred and standardized $P_X$ will converge to $F_2$ (the Tracy-Widom distribution for Gaussian unitary ensembles) if $k = N$. For general distributions, the same result is only known for thin rectangles, that is, for $k$ being of smaller order than $N$; see Bodineau and Martin (2005), Baik and Suidan (2005) and Suidan (2006). In particular, if the $X_{ij}$ have finite third moments, then the result holds for $k = O(N^\alpha)$ with $\alpha < 1/7$. We shall expand this result to locally dependent $X$. If $\eta$ remains bounded, we recover the same maximal order for $k$ as in the independent case. If $\eta$ grows with $N$, however, the maximal order of $k$ will be affected.

Corollary 4.6. Let $X = (X_{ij})_{1 \le i \le N,\, 1 \le j \le k}$ be a collection of $n = Nk$ random variables with mean 0 and variance 1, satisfying the dependence structure of Theorem 3.4, and let $Z \sim \mathrm{MVN}_n(0, \Sigma)$. Then, for any three times differentiable function $g : \mathbb{R} \to \mathbb{R}$,

$$|\mathbb{E} g(P_X) - \mathbb{E} g(P_Z)| \le \frac{C(g, \tau)\, \eta^{1/3} k^{7/6} \log(N)^{2/3}}{N^{1/6}}$$

for some constant $C(g, \tau)$. If $\sigma_{ij} = 0$ for all $i \ne j$, then the $P_X$ will converge to $F_2$ if $\tau$ remains bounded and if

$$k = o\bigl(N^{1/7} \log(N)^{-4/7} \eta^{-2/7}\bigr).$$

Proof. Clearly, $\gamma_1 = k^{1/6} N^{-1/2}$. Furthermore,

$$m = \binom{N+k}{N} \le \Bigl(\frac{N}{k}\Bigr)^{k} \Bigl(\frac{N+k}{N}\Bigr)^{N+k}.$$

As

$$\log(m) \le k(\log(N) - \log(k)) + (N+k)(\log(N+k) - \log(N)) \le k \log(N) + 2k,$$

applying Theorem 4.5 yields the final bound. □
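
The path-counting estimate used in this proof can be confirmed by brute force. The sketch below assumes the reading $m = \binom{N+k}{N} \le (N/k)^k ((N+k)/N)^{N+k}$ together with $\log m \le k \log N + 2k$ for $k \le N$; the helper names are ours and the range of $N$ is an arbitrary choice.

```python
import math

def log_paths_upper(N, k):
    """log of (N/k)^k * ((N+k)/N)^(N+k), the bound on binom(N+k, N)."""
    return (k * (math.log(N) - math.log(k))
            + (N + k) * (math.log(N + k) - math.log(N)))

def chain_holds(N, k):
    """Check log binom(N+k, N) <= intermediate bound <= k log N + 2k."""
    log_m = math.log(math.comb(N + k, N))
    return log_m <= log_paths_upper(N, k) <= k * math.log(N) + 2 * k

# Thin and square rectangles alike: every pair 1 <= k <= N < 200.
all_ok = all(chain_holds(N, k) for N in range(1, 200) for k in range(1, N + 1))
print(all_ok)
```

The second inequality only uses $\log(1 + k/N) \le k/N$ and $k \le N$, so the restriction to thin rectangles enters later, through the condition on $k$ in the corollary.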
4.3 Classic occupancy 

As mentioned in the introduction, we can obtain bounds on (1.1) for the classic occupancy problem. Distribute $m$ balls independently and uniformly among $n$ urns. Let $\xi_i$ be the number of balls in urn $i$. Then, $\xi_i \sim \mathrm{Bi}(m, n^{-1})$ and $\sum_i \xi_i = m$; and therefore

$$X_i = \frac{\xi_i - mn^{-1}}{\sqrt{m(1 - n^{-1})}}$$

satisfies (2.7) and (3.2) and in addition $\sum_i \operatorname{Var} X_i = 1$.

Theorem 4.7. Let $X$ be as above and let $Z \sim \mathrm{MVN}_n(0, \Sigma)$. Then, for any three times partially differentiable function $h$,

$$|\mathbb{E} h(X) - \mathbb{E} h(Z)| \le (|h|_2 + 19 |h|_3) \sqrt{\frac{n^2 + 4mn + 6}{mn(n-1)}}.$$



Proof. We can apply (3.5), as $X$ is exchangeable and (3.2) is satisfied. It is straightforward to verify that

$$\operatorname{Var} X_1^2 = \frac{n^2 + 2(n-1)(m-3)}{mn^2(n-1)}, \qquad \operatorname{Cov}(X_1^2, X_2^2) = -\frac{n^2 - 4n - 2m + 6}{mn^2(n-1)^2}$$

(see Lemma 4.8 below), which implies

$$\operatorname{Var} \sum_i X_i^2 = n \operatorname{Var} X_1^2 + \frac{n(n-1)}{2} \operatorname{Cov}(X_1^2, X_2^2) = \frac{n^2 + 4mn - 2m - 8n + 6}{2mn(n-1)}.$$
Furthermore,

$$\mathbb{E}|X_1|^3 \le \sqrt{\mathbb{E} X_1^2\, \mathbb{E} X_1^4} \le \sqrt{\frac{n^2 + 3nm - 6n - 3m + 6}{mn^3(n-1)}}.$$

From this, the final bound follows. □
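
The variance and covariance of the squared coordinates can be checked exactly for small systems by enumerating all $n^m$ allocations. This is only a sketch: it assumes the normalisation $X_i = (\xi_i - m/n)/\sqrt{m(1 - 1/n)}$ and the closed forms $\operatorname{Var} X_1^2 = (n^2 + 2(n-1)(m-3))/(mn^2(n-1))$ and $\operatorname{Cov}(X_1^2, X_2^2) = -(n^2 - 4n - 2m + 6)/(mn^2(n-1)^2)$ as we read them from the proof; the function names are ours.

```python
from fractions import Fraction
from itertools import product

def exact_var_cov(n, m):
    """Exact Var(X_1^2) and Cov(X_1^2, X_2^2) by enumerating all n**m allocations."""
    scale = Fraction(m * (n - 1), n)   # m * (1 - 1/n)
    mean = Fraction(m, n)
    e1 = e2 = e11 = e12 = Fraction(0)
    for balls in product(range(n), repeat=m):
        a = (balls.count(0) - mean) ** 2 / scale   # X_1^2
        b = (balls.count(1) - mean) ** 2 / scale   # X_2^2
        e1 += a
        e2 += b
        e11 += a * a
        e12 += a * b
    total = n ** m
    e1, e2, e11, e12 = e1 / total, e2 / total, e11 / total, e12 / total
    return e11 - e1 * e1, e12 - e1 * e2

def formula_var_cov(n, m):
    """Closed forms as read from the proof (an assumption about the source)."""
    var = Fraction(n * n + 2 * (n - 1) * (m - 3), m * n * n * (n - 1))
    cov = Fraction(-(n * n - 4 * n - 2 * m + 6), m * n * n * (n - 1) ** 2)
    return var, cov

cases = [(2, 2), (3, 2), (3, 4), (4, 3)]
agreement = [exact_var_cov(n, m) == formula_var_cov(n, m) for n, m in cases]
print(agreement)
```

Exact rational arithmetic avoids any rounding question, at the price of a running time of order $n^m$, so only very small $n$ and $m$ are feasible.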
We record here some identities for the mixed moments of the $\xi_i$, which are easy to verify and needed in the above calculations.



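Lemma 4.8. Let $\xi_1$ and $\xi_2$ be the number of balls in the first two urns when distributing $m$ balls uniformly and independently among $n$ urns. Then

$$\mathbb{E}\xi_1^2 = \frac{m}{n} + \frac{m(m-1)}{n^2},$$

$$\mathbb{E}(\xi_1 \xi_2) = \frac{m(m-1)}{n^2},$$

$$\mathbb{E}(\xi_1^2 \xi_2) = \frac{m(m-1)}{n^2} + \frac{m(m-1)(m-2)}{n^3},$$

$$\mathbb{E}(\xi_1^2 \xi_2^2) = \frac{m(m-1)}{n^2} + 2\,\frac{m(m-1)(m-2)}{n^3} + \frac{m(m-1)(m-2)(m-3)}{n^4}.$$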
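
The identities of the lemma follow from writing $\xi_1$ and $\xi_2$ as sums of indicators; they can also be confirmed exactly by enumeration. The sketch below assumes the four identities are those for $\mathbb{E}\xi_1^2$, $\mathbb{E}(\xi_1\xi_2)$, $\mathbb{E}(\xi_1^2\xi_2)$ and $\mathbb{E}(\xi_1^2\xi_2^2)$, each expressed through falling factorials of $m$; the function names are ours.

```python
from fractions import Fraction
from itertools import product

def mixed_moment(n, m, a, b):
    """Exact E[xi_1^a * xi_2^b] for m balls dropped uniformly into n urns."""
    total = sum(balls.count(0) ** a * balls.count(1) ** b
                for balls in product(range(n), repeat=m))
    return Fraction(total, n ** m)

def falling(m, r):
    """Falling factorial m (m-1) ... (m-r+1)."""
    out = 1
    for i in range(r):
        out *= m - i
    return out

def lemma_rhs(n, m):
    """Right-hand sides, keyed by the exponents (a, b) of xi_1 and xi_2."""
    f2, f3, f4 = falling(m, 2), falling(m, 3), falling(m, 4)
    return {
        (2, 0): Fraction(m, n) + Fraction(f2, n ** 2),
        (1, 1): Fraction(f2, n ** 2),
        (2, 1): Fraction(f2, n ** 2) + Fraction(f3, n ** 3),
        (2, 2): Fraction(f2, n ** 2) + 2 * Fraction(f3, n ** 3) + Fraction(f4, n ** 4),
    }

def lemma_holds(n, m):
    return all(mixed_moment(n, m, a, b) == rhs
               for (a, b), rhs in lemma_rhs(n, m).items())

checks = [lemma_holds(n, m) for n, m in [(2, 3), (3, 4), (4, 2)]]
print(checks)
```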



4.4 Curie-Weiss model in the high-temperature regime

Consider the $n$-spin system $\{-1, 1\}^n$ with Hamiltonian

$$H(\sigma) = -\frac{1}{n} \sum_{i<j} \sigma_i \sigma_j.$$



To each configuration $\sigma$ assign the probability

$$\mathbb{P}[\sigma] = Z_\beta^{-1} \exp(-\beta H(\sigma)),$$

where $Z_\beta$ is the normalising constant. This model is well known as the Curie-Weiss model; we refer to Eichelsbacher and Löwe (2010) for a more detailed discussion of relevant literature. The authors of that article prove in particular bounds in univariate central limit theorems for the total magnetisation of this and similar models. Here, instead, we will estimate the error when we replace all the spins by corresponding Gaussian variables in the high-temperature regime $\beta < 1$; this, in particular, implies the central limit theorem for the total magnetisation.

Previous approaches using Stein's method to analyse the magnetisation of such models make use of exchangeable pairs (Eichelsbacher and Löwe (2010) and Chatterjee and Shao (to appear)), which typically involves resampling a spin conditional on the other spins. It is worthwhile noting that the Stein coupling we will use does not require any resampling and, hence, does not form an exchangeable pair.

To avoid confusion with the notation $\sigma_i$ for the spins, we will use $s_{ij}$ instead of $\sigma_{ij}$ to denote covariances in what follows.

Theorem 4.9. Let $X_i = n^{-1/2} \sigma_i$. Let $Z \sim \mathrm{MVN}_n(0, \Sigma)$, where $\Sigma = (s_{ij})_{1 \le i,j \le n}$ with

$$s_{ii} = \frac{1}{n} + \frac{\beta}{n^2(1-\beta)}, \qquad s_{ij} = \frac{\beta}{n^2(1-\beta)} \quad (i \ne j).$$

Then, for $\beta < 1$,

for some constant $C_\beta$ that only depends on $\beta$.
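
As a sanity check on the choice of $\Sigma$, one can compare the variance of the total magnetisation $n^{-1/2} \sum_i \sigma_i$ under the Curie-Weiss measure with the variance of $\sum_i Z_i$. The sketch below is ours and rests on two assumptions: the Gibbs weights are proportional to $\exp(-\beta H(\sigma))$ with $H(\sigma) = -\frac{1}{n}\sum_{i<j}\sigma_i\sigma_j$, and the covariances are $s_{ij} = \delta_{ij}/n + \beta/(n^2(1-\beta))$, so that all entries of $\Sigma$ sum to $1/(1-\beta)$. The spin side is computed exactly by grouping configurations according to the number of $+1$ spins.

```python
import math

def magnetisation_variance(n, beta):
    """Exact Var(n^{-1/2} * sum_i sigma_i) under the Curie-Weiss measure."""
    log_w, s_vals = [], []
    for k in range(n + 1):              # k = number of +1 spins
        s = 2 * k - n                   # total spin of such a configuration
        log_binom = (math.lgamma(n + 1) - math.lgamma(k + 1)
                     - math.lgamma(n - k + 1))
        # -beta * H(sigma) = beta * (s^2 - n) / (2n), since
        # sum_{i<j} sigma_i sigma_j = (s^2 - n) / 2.
        log_w.append(log_binom + beta * (s * s - n) / (2 * n))
        s_vals.append(s)
    top = max(log_w)                    # subtract the maximum to avoid overflow
    w = [math.exp(v - top) for v in log_w]
    z = sum(w)
    # The mean of s vanishes by spin-flip symmetry, so the variance is E[s^2] / n.
    return sum(wi * s * s for wi, s in zip(w, s_vals)) / (z * n)

def gaussian_variance(n, beta):
    """Sum of all entries s_ij = delta_ij / n + beta / (n^2 (1 - beta))."""
    return n * (1.0 / n) + n * n * beta / (n * n * (1.0 - beta))

spin_var = magnetisation_variance(400, 0.5)
gauss_var = gaussian_variance(400, 0.5)
print(spin_var, gauss_var)
```

For $\beta = 0$ the spins are independent and both variances equal 1; for $0 < \beta < 1$ the two values should be close for large $n$, which is consistent with the central limit theorem for the total magnetisation.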
Proof. Define

$$m_i = \frac{1}{n} \sum_{j \ne i} \sigma_j, \qquad m = \frac{1}{n} \sum_{j} \sigma_j.$$

We recall the estimates

$$\mathbb{E}|m|^k \le C_\beta n^{-k/2}$$

(see Eichelsbacher and Löwe (2010), Lemma 3.5). Let $(I, J, K, L)$ be distributed as in the proof of Theorem 3.3, independent of all else. Using the notation from 3.1 and the proof of Theorem 3.3 (with respect to exchangeability of $X$), define the vectors

$$X' = X^{(I)}, \qquad X'' = X^{IJKL}.$$



Set D — D and define G as 

G fc = -n^SkK 
Define the matrix $S$ as

$$S_{kl} = n^2 \delta_{kK} \delta_{lI} s_{IK}.$$

Define now $f$ as in the proof of Theorem 3.3. Then,

-E^G fc / fe pO 



k 

= n 3/2 E 



n( ^- ) +SK i y<r I -ftm)f K (X) 



and 



= n-V2 E ^^_X_ + 4 ^ _ ft m )f k (X) 



E^G fc / fc (X') 



A- 



(1-/5) 

- n -^ E g^ _A_ + 4^ (tanh^m,) - ftm) f k {X^). 
Using the estimate



| tanh(/3mj) — f3m\ | tanh(/3mj) — tanh(/3m)| + | tanh(/3m) — (3m\ 
(3 /3 3 |m| 3 



G 



we obtain 



(6 + /3 2 nE|m| 3 )(/3(l + /3)) 
6(1 -ftn 1 / 2 



and hence 



6(l-/3)n x /2 



1/2 



(cf. Remark 2.3). Using exchangeability of $X$ for the second equation, the fact that $\delta_{KI} = \delta_{JL}$ and also that $s_{IK} = s_{JL}$, we obtain

mJ2(GkDi-S H )f kl (X") 



k,l 



= ,/ J E^ [ - 



p 



n \n(l — /3) 



+ Ski ) {vi - /3m) <t/ - s JK ] /if,(X /Jifi ) 



= ,/ J E^ [ - 



/3 



n — /3) 



+ Sjl (oj - /3m) a- j - s JL Jki(X) 







n \n(l - /3) 

/3 



' (aj - /3m)<Tj - SjiJf ki (X) 

} 



n(l - /3) 



+ 1 (1 - fam) - s n 



WO ) 



+ K,g/K^))( l -^--),g i IS)} 



-E 



(/3 + n(l-/3))/3m 2 (n-l)/3 



Thus, 



Now 



n(l — /?) ^— ' n(n — 1) 



E 



2 ^ QlHi 



Ei? 2 (i,u) = E ^ (G k Di - S kl )D' m fkim(X + uD') 



> 3 / 2 E 

m£{I,J,K,L} 



/3 



n \n(\ — j3) 



Ski (07 - /3m) 07 - s IK 



x f K i m (X + uD') 
From this, it is not difficult to see that

|E%(*,«)|<^L 

(recall the definition of $s_{ij}$ and note that the probability that $I = K$ is $1/n$).
Similarly,

|EflB(*,«)|<3* 
Putting all the estimates together yields the claim. □